feat(examples): NdarrayGraphPlugin, Bevy SIMD-accelerated graph rendering via ndarray polyfill (#1)
End-to-end smoke test verifying the AdaWorldAPI/ndarray SIMD polyfill is reachable from a Bevy downstream crate.
Asserts:
1. simd_caps() LazyLock reports the live CPU tier
2. F32x16::mul_add is bit-exact against scalar f32::mul_add
3. integrate_simd advances positions by exactly v * dt
4. integrate_simd_par (rayon × SIMD) is bit-identical to sequential
5. compose_neo4j emits both node and edge palette pixels
What it proves: target-cpu propagation, runtime↔compile-time tier agreement, the Pumpkin-derived rasterizer is library-callable, and rayon par_chunks_mut composes cleanly with F32x16::mul_add.
Headless: links the full bevy crate and runs MinimalPlugins for one Update tick before exiting via MessageWriter<AppExit>. Verifies the link, no window.
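Assertions 2 to 4 can be illustrated without the ndarray crate. The sketch below is a standard-library stand-in, not the polyfill itself: `integrate_scalar`, `integrate_chunked`, and `integrate_parallel` are hypothetical names, the 16-wide inner loop only imitates an F32x16 lane layout, and `std::thread::scope` takes the place of rayon. It shows why the results can be bit-identical: every element gets exactly one fused multiply-add, independent of how the slice is chunked or split across threads.

```rust
/// Scalar reference: p[i] = v[i] * dt + p[i], one fused multiply-add per element.
pub fn integrate_scalar(pos: &mut [f32], vel: &[f32], dt: f32) {
    assert_eq!(pos.len(), vel.len());
    for (p, v) in pos.iter_mut().zip(vel) {
        *p = v.mul_add(dt, *p);
    }
}

/// Assumed lane width, mirroring a 16-lane f32 vector.
pub const LANES: usize = 16;

/// Stand-in for a SIMD kernel: full 16-element chunks first, scalar tail last.
pub fn integrate_chunked(pos: &mut [f32], vel: &[f32], dt: f32) {
    assert_eq!(pos.len(), vel.len());
    let mut p_chunks = pos.chunks_exact_mut(LANES);
    let mut v_chunks = vel.chunks_exact(LANES);
    for (p, v) in (&mut p_chunks).zip(&mut v_chunks) {
        for lane in 0..LANES {
            p[lane] = v[lane].mul_add(dt, p[lane]);
        }
    }
    // Elements that do not fill a whole chunk go through the scalar path.
    integrate_scalar(p_chunks.into_remainder(), v_chunks.remainder(), dt);
}

/// Stand-in for the rayon variant: each block of `block` elements runs on its
/// own scoped thread. Blocks never overlap, so no synchronisation is needed.
pub fn integrate_parallel(pos: &mut [f32], vel: &[f32], dt: f32, block: usize) {
    assert_eq!(pos.len(), vel.len());
    assert!(block > 0);
    std::thread::scope(|s| {
        for (p, v) in pos.chunks_mut(block).zip(vel.chunks(block)) {
            s.spawn(move || integrate_chunked(p, v, dt));
        }
    });
}
```

Comparing the three outputs with `f32::to_bits` rather than `==` is the stricter check, since it also distinguishes `0.0` from `-0.0` and compares NaN payloads.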
GitHub Actions runners support x86-64-v3 (AVX2) but NOT x86-64-v4 (AVX-512). Unconditionally setting target-cpu=x86-64-v4 would break CI; unconditionally leaving the default would mean the ndarray polyfill never picks its AVX-512 type path even on capable hardware (the ndarray_simd_smoke example proved this is observable: avx512f=true at runtime but PREFERRED_F32_LANES=8 at compile time).
This template provides both profiles, opt-in:
cargo build → x86-64-v3 (AVX2 baseline, CI-safe)
cargo build-avx512 → x86-64-v4 (AVX-512, 16-lane F32x16)
cargo run-avx512 → ditto
cargo test-avx512 → ditto
cargo check-avx512 → ditto
Follows the existing Bevy convention of providing .cargo/config_*.toml template files that users copy into the gitignored .cargo/config.toml.
Companion to AdaWorldAPI/ndarray PR bevyengine#142 (VBMI gate + Inf clamp + NaN preservation in simd_exp_f32).
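The runtime versus compile-time mismatch described above can be reproduced with the standard library alone. This is a minimal sketch, not ndarray's implementation: `COMPILE_TIME_F32_LANES`, `runtime_has_avx512f`, and `tier_report` are hypothetical names standing in for PREFERRED_F32_LANES and simd_caps(). The constant is fixed by the target features the compiler was given, while the function probes the CPU that is actually running the binary.

```rust
/// Lane count decided at compile time from the enabled target features.
/// Hypothetical analogue of a PREFERRED_F32_LANES-style constant.
pub const COMPILE_TIME_F32_LANES: usize =
    if cfg!(target_feature = "avx512f") { 16 } else { 8 };

/// Runtime probe of the executing CPU (x86 / x86_64 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn runtime_has_avx512f() -> bool {
    std::arch::is_x86_feature_detected!("avx512f")
}

/// Other architectures have no AVX-512, so the probe is constant false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn runtime_has_avx512f() -> bool {
    false
}

/// One-line summary pairing the runtime probe with the compile-time choice.
pub fn tier_report() -> String {
    format!(
        "avx512f={} lanes={}",
        runtime_has_avx512f(),
        COMPILE_TIME_F32_LANES
    )
}
```

Built with the default target-cpu on an AVX-512 machine, a report of this shape would pair a true runtime flag with 8 lanes; only a build that enables the avx512f target feature moves the constant to 16.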
… SIMD
Produced by the 12-agent CCA2A round-2 fleet (see ndarray's
.claude/board/AGENT_LOG.md for full agent breakdown). Delivers the
"Bevy works on SIMD" goal: a real Bevy plugin that uses ndarray's
crate::simd polyfill end-to-end for graph rendering, plus a CI workflow,
headless integration tests, a shared palette LUT, and usage docs.
Files added:
- examples/ndarray_graph_plugin.rs (~270 lines) — NdarrayGraphPlugin
with GraphRenderer Resource, startup seeder (64 nodes in circle layout,
80 ring + cross edges), tick_renderer + render_to_framebuffer Update
systems. Uses crate::simd::F32x16::mul_add via Renderer::tick →
integrate_simd, and compose_neo4j (Pumpkin-derived rasterizer) into a
long-lived 512x512 Framebuffer that gets palette-expanded to RGBA8
and blitted into a Bevy Image displayed as a Sprite.
- examples/ndarray_graph_palette.rs — shared PALETTE_LUT [16 x RGBA8]
+ blit_u8_palette_to_rgba helper, both imported by the plugin via
#[path = "ndarray_graph_palette.rs"] mod palette.
- examples/ndarray_graph_plugin_tests.rs — 5 headless integration tests
(resource init, startup seed, F32x16::mul_add position advance,
compose_neo4j pixel emission, simd_caps runtime detect). Runs as
cargo run --example ndarray_graph_plugin_tests; all pass.
- examples/README_NDARRAY_PLUGIN.md — usage doc (build, run, what it
shows, architecture ASCII diagram, compile-time vs runtime tier
explanation, companion files).
- .github/workflows/ndarray-smoke.yml — GitHub Actions x86-64-v3
baseline build (CI runners don't have AVX-512); installs Bevy system
deps (libwayland-dev / libasound2-dev / libudev-dev); clones sibling
ndarray via the same branch name with master fallback; cargo check
on ndarray_simd_smoke + ndarray_graph_plugin.
Cargo.toml: two [[example]] entries (ndarray_graph_plugin,
ndarray_graph_plugin_tests).
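The palette expansion step mentioned for ndarray_graph_palette.rs can be pictured as a table lookup per pixel. The following is a sketch under assumptions: `demo_palette` and `expand_indices_to_rgba` are hypothetical names, the grey ramp is invented for illustration (the real colours live in PALETTE_LUT), and the actual blit_u8_palette_to_rgba helper may differ in signature and behaviour.

```rust
/// One RGBA8 colour.
pub type Rgba8 = [u8; 4];

/// Hypothetical 16-entry palette: an opaque grey ramp from black to white.
pub fn demo_palette() -> [Rgba8; 16] {
    let mut lut = [[0u8, 0, 0, 255]; 16];
    for (i, entry) in lut.iter_mut().enumerate() {
        let level = (i as u8) * 17; // 0, 17, ..., 255
        *entry = [level, level, level, 255];
    }
    lut
}

/// Expand one palette index per pixel into four RGBA bytes per pixel.
/// Indices are masked to the low 4 bits so the lookup stays in bounds.
pub fn expand_indices_to_rgba(indices: &[u8], lut: &[Rgba8; 16], out: &mut [u8]) {
    assert_eq!(out.len(), indices.len() * 4);
    for (px, &idx) in out.chunks_exact_mut(4).zip(indices) {
        px.copy_from_slice(&lut[(idx & 0x0F) as usize]);
    }
}
```

For a 512x512 framebuffer the output buffer would be 512 * 512 * 4 bytes, which matches the byte layout of an RGBA8 image.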
Verified (Sapphire Rapids, x86-64-v3 build):
cargo check --example ndarray_graph_plugin: clean
cargo check --example ndarray_graph_plugin_tests: clean
cargo check --example ndarray_simd_smoke: clean (regression-safe)
cargo run --release --example ndarray_graph_plugin_tests:
[test 1] PASS: GraphRenderer resource present, tick_count=0
[test 2] PASS: front.len=2 edges.len=1
[test 3] PASS: position[0] 10.0 -> 10.016666 (= 1.0 * DT_60 + 10.0,
confirms F32x16::mul_add polyfill ran inside Bevy)
[test 4] PASS: compose_neo4j emitted 106 non-zero pixels
[test 5] simd_caps: avx512f=true avx2=true fma=true; lanes=8
[test 5] PASS: x86_64 has avx512f or avx2
Notable: the [test 5] line surfaces the compile-time vs runtime mismatch
(lanes=8 because CI-baseline cargo build, but CPU has avx512f=true).
cargo run-avx512 from .cargo/config_ndarray_simd.toml (already on this
branch) lifts that to lanes=16.
Architecture note for GPU-less hosts (Railway / HuggingFace Spaces /
Cloudflare / serverless): this plugin is a CPU-only path. The
Pumpkin-derived framebuffer was designed for the no-GPU case — palette
indices on CPU, 4 bpp wire format via Framebuffer::pack(). The audit
sub-fleet confirmed bevy_pbr / atmosphere / skinning paths are
GPU-offloaded on hosts with GPUs, but this plugin remains entirely
SIMD-CPU and works identically without a GPU.
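A 4 bpp wire format stores two 16-colour palette indices in each byte. The sketch below illustrates that idea only: `pack_4bpp` and `unpack_4bpp` are hypothetical helpers, and the nibble order (first pixel in the low nibble) is an assumption, so Framebuffer::pack() may lay bytes out differently.

```rust
/// Pack one palette index per input byte into two indices per output byte.
/// Assumed layout: even pixel in the low nibble, odd pixel in the high nibble.
/// An odd-length input leaves the final high nibble as zero.
pub fn pack_4bpp(indices: &[u8]) -> Vec<u8> {
    indices
        .chunks(2)
        .map(|pair| {
            let lo = pair[0] & 0x0F;
            let hi = pair.get(1).copied().unwrap_or(0) & 0x0F;
            lo | (hi << 4)
        })
        .collect()
}

/// Inverse of `pack_4bpp`; `count` is the original pixel count, needed to
/// drop the padding nibble when that count was odd.
pub fn unpack_4bpp(packed: &[u8], count: usize) -> Vec<u8> {
    packed
        .iter()
        .flat_map(|&b| [b & 0x0F, b >> 4])
        .take(count)
        .collect()
}
```

Under this layout a 512x512 frame of indices packs into 131,072 bytes, half the size of the one-byte-per-pixel buffer and one eighth of the RGBA8 image.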
Welcome, new contributor! Please make sure you've read our contributing guide, as well as our policy regarding AI usage, and we look forward to reviewing your pull request shortly ✨
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ca4a973aea
# checkout for the ndarray_simd_smoke example (proves crate::simd::F32x16
# routes correctly through to AVX-512/AMX/AVX2/NEON from a downstream crate
# and that integrate_simd_par composes rayon × SIMD bit-identically).
ndarray = { path = "../ndarray", features = ["rayon"] }
Avoid requiring a sibling ndarray checkout
When the repository is checked out normally without a manually cloned ../ndarray, this path dev-dependency prevents Cargo from loading the workspace at all, not just these new examples. I verified this in the repo with cargo metadata --no-deps --format-version 1, which fails while reading /workspace/ndarray/Cargo.toml; that means ordinary commands for unrelated packages/examples are blocked unless every developer and CI job creates the sibling checkout first. Please avoid a mandatory parent-directory path dependency in the root manifest, or gate/patch it so the default workspace remains loadable.
You added a new example but didn't add metadata for it. Please update the root Cargo.toml file.
… PR #1
Three classes of CI failure caught by the PR #1 review pass:
1. CODEX P1: path = "../ndarray" broke cargo metadata workspace-wide on any host without the sibling checkout. Every cargo command on every package failed. Switched to a git dev-dep on https://github.com/AdaWorldAPI/ndarray.git branch master.
2. UPSTREAM CI MATRIX FAILURE: adding the ndarray dev-dep made upstream bevy CI try to fetch + build ndarray on macOS / Windows runners, where ndarray's AMX inline asm + Linux prctl path do not yet compile. Two gates layered:
a. Target.cfg-gated the ndarray dev-dep to `cfg(all(target_os = "linux", target_arch = "x86_64"))` so non-supported platforms never try to resolve the dep.
b. Added `ndarray-examples = []` feature + `required-features = ["ndarray-examples"]` on all three [[example]] entries so `cargo build --examples` without the feature does not pick them up at all. Upstream CI does not enable this feature; our .github/workflows/ndarray-smoke.yml does.
3. ZIZMOR security findings on the workflow:
- "Workflow does not contain permissions" → added explicit `permissions: contents: read` at workflow level.
- "code injection via template expansion" → the `${{ github.head_ref || github.ref_name }}` in the run: block was a code-injection surface (a maliciously-named branch could inject shell). Removed entirely: with the git dev-dep change above, the workflow no longer needs to clone ../ndarray, so the template expansion site is gone.
- "unpinned action reference" → pinned actions/checkout@v4 to commit SHA 692973e3d937129bcbf40652eb9f2f61becf3332 (v4.1.7) and dtolnay/rust-toolchain@1.95.0 to its commit SHA f04cf2e09f5b6448b46c0aa9893a76ee36ed64c2.
Verified:
cargo check --example ndarray_graph_plugin --features ndarray-examples → clean (git dep resolves, plugin compiles, ndarray builds via git)
cargo check --examples (no feature) → does NOT touch the ndarray_* examples (required-features works)
# The action treats "1.95.0" as a toolchain version, but the action ref
# itself must be a commit SHA. Commit f04cf2e09f5b6448b46c0aa9893a76ee36ed64c2
# corresponds to the stable tag.
- uses: dtolnay/rust-toolchain@f04cf2e09f5b6448b46c0aa9893a76ee36ed64c2
steps:
  # Pinned to commit SHA per zizmor unpinned-action rule on PR #1.
  # v4.1.7 corresponds to commit 692973e3d937129bcbf40652eb9f2f61becf3332.
  - uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332
You added a new feature but didn't update the readme. Please run
Summary
Real Bevy plugin demonstrating SIMD-accelerated graph (nodes+edges) rendering using AdaWorldAPI/ndarray's crate::simd::F32x16 polyfill and the Pumpkin/Mindcraft-derived palette framebuffer. Produced by a 12-agent CCA2A fleet (full breakdown in ndarray's AGENT_LOG.md).
What ships
- examples/ndarray_graph_plugin.rs (~270 lines): NdarrayGraphPlugin with GraphRenderer Resource, startup seeder (64 nodes circle layout + 80 edges), tick_renderer + render_to_framebuffer Update systems. Uses crate::simd::F32x16::mul_add via Renderer::tick → integrate_simd, and compose_neo4j (Pumpkin rasterizer) into a long-lived 512×512 Framebuffer → palette-expanded RGBA8 → Bevy Image → Sprite.
- examples/ndarray_graph_palette.rs: shared PALETTE_LUT (16 × RGBA8) + blit_u8_palette_to_rgba helper.
- examples/ndarray_graph_plugin_tests.rs: 5 headless integration tests (all pass on Sapphire Rapids).
- examples/README_NDARRAY_PLUGIN.md: usage doc + architecture diagram.
- .github/workflows/ndarray-smoke.yml: CI workflow targeting x86-64-v3 (CI runners don't have AVX-512).
Test results
Architecture
GPU-less hosts (Railway / HuggingFace / Cloudflare / serverless)
This plugin is a pure CPU path by design. The Pumpkin/Mindcraft framebuffer was built for the no-GPU case: palette indices on CPU, 4 bpp wire format via Framebuffer::pack(), client paints. The fleet's audit confirmed that Bevy's bevy_pbr / atmosphere / skinning paths are GPU-offloaded on hosts with GPUs (so SIMD wins there are unreachable), but this plugin works identically on GPU-less hosts because nothing in its hot path touches wgpu's compute layer.
CI
.github/workflows/ndarray-smoke.yml targets stock ubuntu-latest runners (x86-64-v3, AVX2 baseline). Does NOT use cargo build-avx512 (would SIGILL on CI runners that lack AVX-512). Local AVX-512 builds use the alias from .cargo/config_ndarray_simd.toml (already on this branch in commit 67182a9).
Companion PR
ndarray-side SimdCaps extensions (AMX/VNNI/BF16 fields) ship in AdaWorldAPI/ndarray claude/simd-caps-amx-round2. Not a hard dependency: this plugin works against ndarray master as-is.
Audit deferrals (fleet output)
The 6 audit agents inventoried Bevy upstream SIMD opportunities. Top findings:
- U8x32 polyfill in simd_avx2.rs (currently absent; keystone work)
- SimdCaps; 1 (Linux prctl) is per-thread and will SIGILL on rayon workers if AMX paths get rayon-parallelized later
Generated by Claude Code